(54) Distributed client-based data caching system and method

(57) A system and method for enabling data package distribution to be performed by a plurality of peer clients connected to each other through a network, such as a LAN (local area network). Each peer client can obtain data packages from any other peer client or from an external server. However, each peer client preferably obtains data packages from other peer clients, rather than obtaining data packages from the external server.
Description

[0001] The present invention relates to a distributed client-based data caching system and method. The system and method of the present invention enable data packages to be served to a client through a flexible, non-deterministic distributed system of peer clients which cache the data packages, in order to maximize efficiency and speed.

[0002] Networks of computers, such as the Internet or [...], enable a client computer to obtain data packages, such as documents, images, messages, data packages or other types of data, from remote storage media which are not installed on the client computer itself. Instead, these remote storage media are managed and accessed through a remote computer, known as a server computer or simply as a "server" (in the same vein, the client computer is also known simply as a "client"). The advantage of such a system is that the client computer can potentially obtain data from any server on the network. The disadvantage of the system is that the client depends on the network to enable data to be transmitted from the server to the client. Furthermore, if the load is not evenly distributed among the servers on the network, one server may become overwhelmed, decreasing the speed and efficiency of retrieval. Thus, currently many networks cannot provide rapid and efficient data retrieval due to the heavy demands placed upon the available bandwidth.

[0003] Proxy servers are often installed to conserve bandwidth on an Internet connection or on connections to other LANs (local area networks). These proxy servers cache frequently accessed data, thereby reducing the load on the connection and spreading demand for bandwidth more evenly across the network. Unfortunately, such proxy servers [...]. Each client computer on the LAN has to be separately configured in order to communicate with the proxy server. Such configuration is deterministic, such that each client must be configured to communicate with each proxy server separately. Thus, proxy servers have many drawbacks.

[0004] [...] communicate without requiring any special installation or configuration. Furthermore, such a solution would be non-deterministic, such that each client could communicate with more than one server according to the load on each server and according to [...] itself. Unfortunately, such a solution is not currently available.

[0005] Therefore, there is an unmet need for, and it would be highly useful to have, a distributed client-based data caching system and method which enable data to be stored and retrieved from a plurality of peer clients, or "caching entities", yet which does not require any special configuration or installation of separate servers.

[0006] The present invention is of a distributed client-based data caching system and method which enable data to be served to a client through a flexible, non-deterministic distributed system of caching entities, in order to maximize efficiency and speed for serving the documents to the client. The caching entities are peer clients which serve the data to each other, thereby reducing the amount of bandwidth required to obtain data from an external server.
[0007] According to the present invention, there is provided a method for distributing data packages across a network, the network featuring an external server for serving at least one data package, the external server being a dedicated server, the steps of the method being performed by a data processor, the method comprising the steps of: (a) providing a plurality of peer clients attached to the network and a list of data packages being stored by each of the plurality of peer clients, each data package on the list of data packages having an entry, the entry indicating a unique identifier [...]; (b) [...] the list of data packages by a first peer client to find an entry for a data package; and (c) if the entry for the data package [...] the first peer client, retrieving the data package from the location at another of the plurality of peer clients according to the entry for the data package.

[0008] Alternatively, the list of data packages is stored on the external server.

[0009] According to preferred embodiments of the present invention, the list of data packages is stored on at least one peer client. Preferably, if alternatively the entry for the data package is absent from the list of data packages of the first peer client, the method further comprises the steps of: (d) sending a request message [...] the first peer client to at least one other peer client; and (e) if a response message is received by the first peer client [...]

[0010] Preferably, the request message and the response message are transmitted to the plurality of peer clients by broadcasting. Alternatively, the request message and the response message are transmitted to the plurality of peer clients by multicasting. Also alternatively, the request message and the response message are transmitted to the plurality of peer clients by polling each peer client individually.

[0011] Also alternatively and preferably, if the response message is not received from the at least one other peer client by the first peer client, the method further comprises the step of: (f) obtaining the data package by the first peer client from the external server. Preferably, the method further comprises the step of sending a response message by the first peer client to the at least one other peer client substantially before the first peer client obtains the data package from the external server. More preferably, the list of data packages is stored on each of the plurality of peer clients, and the method further comprises the steps of: (g) receiving the response message from the first peer client by the at least one other peer client; and (h) altering the list of data packages being stored by the at least one other peer client for indicating the location of the data package according to the response message.
[0012] Alternatively, the list of data packages is stored on each of the plurality of peer clients, and the method further comprises the steps of: (g) receiving the response message from the first peer client by the at least one other peer client; and (h) altering the list of data packages being stored by the at least one other peer client for indicating the location of the data package according to a probabilistic function.

[0013] Preferably, the probabilistic function is performed according to a set of equations:

Old location: $P_o(x) = \dfrac{1}{\mathrm{generation} + 1}$

New location: $P_n(x) = 1 - \dfrac{1}{\mathrm{generation} + 1}$

wherein $P_n(x)$ is a probability that the new location is substituted for the old location, $P_o(x)$ is a probability that the old location is retained, and "generation" indicates how many times the location had been previously changed.
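By way of illustration only, the probabilistic function stated above could be sketched as follows. The function names and the injectable random source `rng` are hypothetical and are not part of the described method; how "generation" is initialized and incremented is left to the caller. Taken literally, the equations give a substitution probability of zero when "generation" is zero.

```python
import random


def location_probabilities(generation):
    """Return (p_old, p_new) for an entry whose location changed `generation` times."""
    p_old = 1.0 / (generation + 1)   # probability that the old location is retained
    p_new = 1.0 - p_old              # probability that the new location is substituted
    return p_old, p_new


def choose_location(old_location, new_location, generation, rng=random.random):
    """Pick which location to keep; `rng` must return a float in [0, 1)."""
    _, p_new = location_probabilities(generation)
    return new_location if rng() < p_new else old_location
```

For example, `choose_location("10.0.0.2", "10.0.0.3", 1)` would keep either address with equal probability.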
[0014] Also preferably, an upper limit is predetermined for a number of the plurality of peer clients served substantially simultaneously by the at least one other peer client, such that if a number of the plurality of peer clients served substantially simultaneously by the at least one other peer client is greater than the upper limit, the method further comprises the step of: (d) sending a busy message from the at least one other peer client to the first peer client.

[0015] Preferably, the external server is a Web server, and the plurality of peer clients is a plurality of Web browsers.

[0016] Also preferably, the external server is a BackWeb™ server, and the plurality of peer clients is a plurality of BackWeb™ clients.

[0017] Preferably, the unique identifier for the data package is an MD5 digest of the data package.

[0018] According to still other preferred embodiments of the present invention, the step of retrieving the data package is performed according to a protocol based on TCP/IP. Preferably, the protocol is HTTP. Alternatively and preferably, the protocol is FTP.

[0019] Hereinafter, the term "protocol based on TCP/IP" includes any such protocol, including but not limited to the HTTP (hypertext transfer protocol) and FTP (file transfer protocol) protocols.

[0020] Hereinafter, the term "data package" refers to any discrete, identifiable unit of data, including but not limited to documents, images, messages, data packages or any other type of data.

[0021] Hereinafter, the term "computing platform" refers to a particular computer hardware system or to a particular software operating system. Examples of such hardware systems include, but are not limited to, personal computers (PC), Apple Macintosh™ computers, mainframes, minicomputers and workstations, which are also non-limiting examples of data processors for operating a software application under an operating system. Examples of such software operating systems include, but are not limited to, UNIX, VMS, Linux, MacOS™, DOS, one of the Windows™ operating systems by Microsoft Corp. (USA), including Windows NT™, Windows 3.x™ (in which "x" is a version number, such as "Windows 3.1™"), [...]

[0022] For the present invention, a software application could be written in a substantially suitable programming language, which could easily be selected by one of ordinary skill in the art. The programming language chosen should be compatible with the operating system according to which the software application is executed. Examples of suitable programming languages include, but are not limited to, C, C++ and Java.

[0023] Hereinafter, the term "broadcast" may also include "multicast" as well.

[0024] The invention is herein described, by way of example only, with reference to the accompanying drawings, wherein:


FIGS. 1A and 1B are schematic block diagrams of an exemplary basic system and method according to the present invention;

FIGS. 2A-2E are schematic block diagrams of an exemplary request/response protocol and method according to the present invention;

FIG. 3 is a schematic block diagram of an exemplary preferred data-flow diagram according to the present invention;




EPC993163A1 

FIG. 4 is a flowchart of a method for operating the system of the present invention with Web browsers; and

FIGS. 5A and 5B are exemplary request and response messages according to the present invention.

[0025] The present invention is of a distributed client-based data caching system and method which enable data to be served to a client through a flexible, non-deterministic distributed system of caching entities, in order to maximize efficiency and speed for serving the data to the client. The caching entities are peer clients which serve the data to each other, thereby reducing the amount of bandwidth required to obtain data from an external server.

[0026] The system and method of the present invention enable clients to share data packages among themselves across their local network neighborhood, for example within a LAN, thereby eliminating the need for a specialized proxy server. Furthermore, the network traffic is not significantly affected, since modern network architectures are well suited for peer-to-peer communications. Most currently operating networks have a star topology, using switching hubs in which communication between two peers does not affect simultaneous communication among other nodes on the network. Thus, the system of the present invention overcomes the drawbacks of a proxy server, yet does not add significant loads to the traffic on the network itself.

[0027] For currently available client-server software applications known in the art, whenever a client requires a data package the following algorithm is performed. First, the software application attempts to locate the data package locally. If the data package is not found locally, the software application retrieves the data package from the appropriate server.

[0028] By contrast, the operation of the system of the present invention adds an intermediate step. For the present invention, if the data package is not found locally, an attempt is made to retrieve the data package from a peer client on the local network "neighborhood" before resorting to retrieving the data package from the server.

[0029] Thus, for the system of the present invention, every client actually functions as a caching proxy. Once a client requires a data package, it queries all the hosts, which are actually peer clients, on the local network for that data package. If no neighboring peer client has the data package, the client retrieves the data package from the external server as usual. However, if a neighboring client already has the required data package, the requesting client will download this data package from the peer client rather than from the external server.
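The retrieval order described above (local storage, then a neighboring peer client, then the external server) could be sketched as follows. This is a minimal illustration under stated assumptions: peer caches are modeled as in-memory dictionaries rather than network hosts, and the names `obtain_package` and `fetch_from_server` are hypothetical.

```python
def obtain_package(package_id, local_cache, peers, fetch_from_server):
    """Return (data, source), trying local storage, then peers, then the server.

    local_cache: dict mapping a package identifier to the bytes held by this client.
    peers: iterable of dicts, each standing in for a neighboring peer client's cache.
    fetch_from_server: callable used only when no peer client holds the package.
    """
    if package_id in local_cache:
        return local_cache[package_id], "local"
    for peer_cache in peers:
        if package_id in peer_cache:
            data = peer_cache[package_id]
            # Assumption: the client keeps a copy so that it can serve it later.
            local_cache[package_id] = data
            return data, "peer"
    data = fetch_from_server(package_id)
    local_cache[package_id] = data
    return data, "server"
```

In this sketch the external server is contacted only for a package which no neighboring peer client holds.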

[0030] The principles and operation of the distributed client-based data caching system according to the present invention may be better understood with reference to the drawings and the accompanying description.

[0031] Figure 1A is a schematic block diagram of an exemplary system according to the present invention, while Figure 1B is a flowchart of the operation of the system of Figure 1A. Figure 1A shows a system 10 which includes a plurality of peer clients 12, connected by a local network 14 of some type, for example a LAN, indicated by the heavier line in Figure 1A. Two peer clients 12, labeled as "peer client 1" 20 and "peer client 2" 22, are shown for the purposes of illustration only and without intending to be limiting in any way. Each peer client 12 is also connected to an external server 16 of some type by an external connection 18. Although only one external server 16 is shown, a plurality of external servers could also be implemented. External server 16 is a dedicated server, in the sense that this server has a primary or at least a substantially significant role as a server for data packages. As shown for the purposes of illustration, external connection 18 only connects to local network 14 at one point, although multiple such external connections could also be implemented (not shown). In addition, external connection 18 could also optionally connect each peer client 12 directly to server 16 (not shown).

[0032] The operation of system 10 according to the present invention is illustrated with reference also to Figure 1B. In step 1, a peer client 12, such as peer client 20, looks for a data package in the local memory or disk cache of that particular peer client 12. If the desired data package is not found on the local disk cache, then in step 2, peer client 12 queries any other peer client(s) 12 on local network 14 to determine whether any other peer client 12 has a particular data package. For example, peer client 20 could query peer client 22, to determine whether peer client 22 has the desired data package. In step 3a, if peer client 22 has the desired data package, then peer client 20 obtains the data package from peer client 22. Alternatively, as shown in step 3b, if peer client 22 does not have the desired data package, then peer client 20 obtains the data package from server 16 through external connection 18. Thus, every peer client 12 is also potentially a server which is internal to local network 14, and hence could be described as an "internal server" to distinguish peer client 12 from external server 16.

[0033] Each peer client 12 could also be described as a "caching entity" and the data stored by each client for serving to other peer clients 12 as "cached data" or "cached data packages".
[0034] A number of different possible embodiments of the system of the present invention can be implemented, of which two illustrative embodiments are shown with reference to the Figures below. Briefly, Figures 2A-2D illustrate an exemplary embodiment of the system of the present invention for implementation with the software application of BackWeb™ (BackWeb Technologies Ltd., Ramat Gan, Israel) on a local area network (LAN). Figures 4 and 5A-5B illustrate an exemplary embodiment of the system of the present invention for implementation with a Web browser software application on the Internet.
[0035] Figure 2A shows an exemplary local network 24 which features a plurality of peer clients 12, of which three are shown for the purposes of discussion only and without intending to be limiting in any way. For the purposes of discussion only, suppose a peer client 26, labeled "A", wishes to obtain four data packages "W", "X", "Y" and "Z". None of these data packages are local to peer client 26, which must therefore obtain these data packages from either another peer client 12, as an internal server, or from an external server (not shown). Local area network 24 features two other peer clients 12: peer client 28, labeled "B", and peer client 30, labeled "C". Peer client 26 must therefore first communicate a request to peer client 28 and peer client 30 to see if the desired data packages are available at either location, and then peer client 26 must obtain these data packages from peer client 28 or peer client 30 if the desired data packages are available.

[0036] Preferably, two protocols are used for communication between peer clients on a local area network (LAN): a data package exchange protocol and a control protocol. Specifically, the data package exchange protocol is used to transfer data packages between peer clients, once the desired data package has been located, and is described in greater detail with respect to Figure 2B below. The control protocol enables each peer client to efficiently build and maintain tables which describe the location of available data packages across the local area network by exchanging messages.

[0037] Each peer client maintains two hash-tables which contain information about data package location: a local-data packages table and a network-data packages table. The local-data packages table is a hash-table of data packages which reside on the storage medium or media of the peer client itself. The network-data packages table is a hash-table of data packages which reside on the storage medium or media of other clients on the local network. This table contains the local area network address of the peer client on which each data package is being stored. The size of this hash-table is preferably limited in order to reduce memory consumption. More preferably, each entry in the table has a time-stamp, such that older entries are purged when the size of the table exceeds the upper permissible limit.
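A size-limited network-data packages table with time-stamped entries could be sketched as follows. The class name, the default limit of 1024 entries and the injectable `clock` are assumptions made for illustration only; the text specifies only that the table is limited in size and that older entries are purged first.

```python
import time


class NetworkPackagesTable:
    """Maps a package identifier to (peer address, time-stamp), with a size cap."""

    def __init__(self, max_entries=1024, clock=time.monotonic):
        self.max_entries = max_entries  # assumed limit, not taken from the text
        self.clock = clock
        self.entries = {}

    def record(self, package_id, peer_address):
        """Store or refresh a location, then purge the oldest entries if over the cap."""
        self.entries[package_id] = (peer_address, self.clock())
        while len(self.entries) > self.max_entries:
            oldest = min(self.entries, key=lambda key: self.entries[key][1])
            del self.entries[oldest]

    def locate(self, package_id):
        """Return the recorded peer address, or None if the location is unknown."""
        entry = self.entries.get(package_id)
        return entry[0] if entry else None
```

The local-data packages table could be an ordinary dictionary keyed by the same identifier, since it describes only the storage of the peer client itself.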
[0038] In order to effectively identify the desired data package, preferably each data package has a unique identifier or "fingerprint" associated with it. More preferably, this unique identifier is an MD5 digest of the content of the data package (for a description of the MD5 specification, which is an industry standard and would therefore be obvious to one of ordinary skill in the art, see "RFC 1321" at http://ds.internic.net/rfc/rfc1321.txt).
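For illustration, such a fingerprint could be computed with the MD5 implementation in the Python standard library. The function name `package_fingerprint` is hypothetical.

```python
import hashlib


def package_fingerprint(content: bytes) -> str:
    """Return the MD5 digest of the package content as 32 hexadecimal characters."""
    return hashlib.md5(content).hexdigest()
```

Two peer clients holding byte-identical content therefore derive the same identifier without any coordination.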

[0039] Once any peer client knows both the unique identifier and the location of the data package on the local network, that client can then proceed to download the data package. However, the peer client may not know the location of the desired data package, in which case the client must follow a control protocol according to the present invention in order to determine the location of the desired data package and to enable the client to build these hash tables with respect to future attempts to locate a data package.

[0040] The control protocol is used to provide each client with knowledge about the locations of data packages across the local network. In the preferred implementation illustrated with respect to Figures 2A-2D, control messages are preferably sent and received as broadcast or multicast packets. Local area networks such as Ethernet networks support broadcast or multicast packets such that all peer clients on a local area network receive the broadcast or multicast packets. Effectively, a single packet can be sent to all peer clients by using broadcast or multicast, thereby reducing the amount of traffic on the network required as a result of transmitting the request message (see for example Chapter 12, "Broadcasting and Multicasting", of TCP/IP Illustrated, Volume 1 by W. Richard Stevens, Addison-Wesley, 1994). However, optionally the system of the present invention could poll each peer client individually with a control message for that peer client, although this is not preferred since such individually addressed messages would consume excessive amounts of available bandwidth. In such a situation, preferably polling would be restricted to a certain group of peer clients as internal servers, in order to reduce the amount of traffic on the local area network.
[0041] For the preferred implementation in which broadcast or multicast is used, more preferably, the decision to select either IP multicast or broadcast is made according to the configuration set by the network administrator for the local area network. IP multicast is preferable in terms of load on the clients of the local network, but may not be supported on all platforms (operating systems). More preferably, the TTL or Time to Live may be configured. The TTL specifies the number of routers a packet can cross before being dropped. Configuring the TTL enables data package sharing to be expanded across subnet boundaries.
[0042] As shown with respect to Figure 2B, the control protocol of the present invention preferably operates as follows. In step 1, peer client "A" from Figure 2A looks for a data package on the local storage medium or media. In step 2, since the data package was not found locally on the medium or media of peer client "A", peer client "A" must download the data package and therefore preferably multicasts (or alternatively broadcasts) a request message. A request message preferably contains a protocol identifying version number (PVN) for the control protocol of the present invention and a list of MD5 digests of the needed data packages, as shown in Figure 2C.

[0043] Optionally and preferably, if more than one data package is desired, a list of requested data packages is included in the request message rather than a single MD5 digest, in order to reduce the total number of request messages on the network.
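One possible encoding of such a request message is sketched below. The byte layout (one byte for the PVN, one byte for the number of digests, then the 16-byte digests back to back) is purely an assumption made for illustration; the actual layout of Figure 2C is not reproduced in this text.

```python
import struct

DIGEST_SIZE = 16  # an MD5 digest is 128 bits


def encode_request(pvn, digests):
    """Pack a PVN and a list of 16-byte MD5 digests into one datagram payload."""
    if any(len(digest) != DIGEST_SIZE for digest in digests):
        raise ValueError("each digest must be 16 bytes")
    return struct.pack("!BB", pvn, len(digests)) + b"".join(digests)


def decode_request(payload):
    """Inverse of encode_request; returns (pvn, list of digests)."""
    pvn, count = struct.unpack_from("!BB", payload)
    body = payload[2:]
    if len(body) != count * DIGEST_SIZE:
        raise ValueError("payload length does not match digest count")
    return pvn, [body[i:i + DIGEST_SIZE] for i in range(0, len(body), DIGEST_SIZE)]
```

Carrying several digests in one payload reflects the stated aim of reducing the total number of request messages on the network.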

[0044] In step 3, the neighboring clients, shown as peer clients "B" and "C" in Figure 2A, receive this request message and search for the requested data package in their local-data packages hash-table. A peer client which does not find the data package locally does not reply, as shown in step 4a. Otherwise, in step 4b the peer client sends a response message, preferably after waiting a short random interval [...] first. More preferably, the peer client does not distribute the response message if another client responded previously, in order to reduce unnecessary traffic on the local area network. Also more preferably, the peer client distributes the response message by broadcast or multicast.

[0045] For example, as shown in Figure 2A, if peer client "A" requests a data package "W", peer client "B" would reply with the response message, since peer client "B" has the data package stored locally. By contrast, peer client "C" would not reply with a response message, since peer client "C" does not have data package "W". On the other hand, if peer client "A" requests a data package "X", both peer client "B" and peer client "C" could respond. In this situation, preferably only peer client "B" or peer client "C" would respond, depending on which peer client had the shorter random interval for waiting before sending the response message.

[0046] More preferably, responses are sent only for data packages with yet unknown locations. For example, suppose client "A" requests data packages "W", "X", "Y" and "Z". Client "B" has data packages "W", "X" and "Y" and is the first to reply, with a reply message indicating possession of data packages "W", "X" and "Y". Client "C" has data packages "X", "Y" and "Z". Since it replied after client "B", the response message from client "C" will only indicate possession of data package "Z" because this is the only data package with an as yet unknown location.
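The rule of responding only for data packages with as yet unknown locations could be sketched as follows. The function name and its parameters are hypothetical, and in the usage example the holdings of client "B" are assumed to be "W", "X" and "Y" so that only "Z" remains unknown after its reply.

```python
def digests_to_announce(requested, held_locally, already_located):
    """Return the requested packages which this peer client should advertise.

    requested: package identifiers from the request message, in order.
    held_locally: set of identifiers in this peer's local-data packages table.
    already_located: set of identifiers advertised by earlier responses.
    """
    return [
        package_id
        for package_id in requested
        if package_id in held_locally and package_id not in already_located
    ]


requested = ["W", "X", "Y", "Z"]
reply_b = digests_to_announce(requested, {"W", "X", "Y"}, set())
reply_c = digests_to_announce(requested, {"X", "Y", "Z"}, set(reply_b))
```

Here `reply_c` contains only "Z", since the locations of "X" and "Y" are already known from the earlier reply.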
[0047] A response message optionally contains the identifying PVN, the list of MD5 digests of data packages that were found and a TCP port number, as shown in Figure 2D. The port number identifies on which TCP port the responding peer client is waiting for data package requests. Alternatively, the response message optionally contains other indicators which enable the requesting client to retrieve one or more data packages from the responding peer. Preferably, response messages are also broadcast for data packages which are currently being downloaded from an external server, for reasons described in greater detail below.

[0048] In step 5, the peer client downloads the data package or data packages. In principle, according to a relatively simple embodiment of the present invention, at this stage the requesting client either receives a reply and downloads the data packages from the client that replied, or, if a reply is not received within a certain period of time, proceeds to download these data packages from an external server. If the peer client is downloading a data package from another peer client as an internal server, the data package exchange protocol is used to obtain the data package. The data package exchange protocol is based on some appropriate peer-to-peer communication protocol, including but not limited to the HTTP protocol (see RFC 2068, "Hypertext Transfer Protocol - HTTP/1.1", available from http://ds.internic.net/rfc/rfc2068.txt as of September 23, 1998).

[0049] Preferably, a more complex implementation is employed, since such a simple implementation may cause multiple clients to fetch the same data packages from the external server simultaneously. This situation would arise if several peer clients need to download the same data packages at approximately the same time, which is a very probable scenario for push clients, for which content delivery is triggered by an external server, since none of these clients would receive a response to its request. Instead, the other clients would still be downloading the data package when the new client request is broadcast, such that none of them would be ready to serve these data packages. Thus, many or even all of the clients would attempt to retrieve the data package from the external server and not from another peer client, thereby increasing the amount of traffic on the network and reducing the efficiency of operation of the system of the present invention.

[0050] Preferably, the problem is solved by notifying other clients when a first client is downloading the data package from the external server, even if the process of retrieving the data package is not yet complete. In this preferred embodiment, the first client which requires the data package obtains the data package from the external server. Other clients which require the data package will then download it from the first client even if the first client is still in the process of retrieving the data package from the external server. The preferred embodiment of the method of the present invention is described in greater detail with regard to Figure 2E.

[0051] In step 1, the requesting client again transmits the request, again preferably by broadcasting or multicasting, and then waits for a response. If no response is received within a certain period of time, in step 2 the client transmits a response message as if replying to its own request, indicating that this client either has the data package or, in this case, that the client is retrieving the data package. In step 3, the client retrieves the data package from the external server.

[0052] In step 4, other clients create an entry in their network data packages hash table, indicating the location of the client which will be serving the data package. Thus, preferably only a single client accesses the external server for [...]

[0053] When a request is sent for multiple data packages, but a response is received indicating the location of only some of the data packages at a neighboring peer client or clients, the client first obtains these data packages from the neighboring peer client or clients. Next, the client transmits the response message for the rest of the data packages and proceeds to obtain the remaining data package or data packages from the external server. Thus, the client only obtains from the external server the data package or data packages which are not available locally, rather than obtaining all of the data packages from the external server, thereby reducing network traffic.
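The handling of a request for multiple data packages could be sketched as follows. The function name, the representation of responses as (address, identifiers) pairs and the choice of the first advertising peer for each package are assumptions made for illustration only.

```python
def plan_retrieval(requested, peer_responses):
    """Split the requested packages into peer downloads and external downloads.

    peer_responses: list of (peer_address, [package identifiers]) in arrival order;
    the first peer to advertise a package is used for it.
    Returns (from_peers, self_announce), where self_announce lists the packages
    which this client advertises as its own before fetching them from the server.
    """
    from_peers = {}
    for address, package_ids in peer_responses:
        for package_id in package_ids:
            if package_id in requested:
                from_peers.setdefault(package_id, address)
    self_announce = [pid for pid in requested if pid not in from_peers]
    return from_peers, self_announce
```

The client would transmit its own response message for the identifiers in `self_announce` before contacting the external server, so that other clients register it as the location of those data packages.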

[0054] According to preferred embodiments of the present invention, preferably the process of downloading data packages from peer clients is optimized to reduce the amount of time required for downloading, the load on each individual client and the overall network traffic. Such optimization is performed as follows.

[0055] First, preferably the exit degree of each client is bounded, such that each client is only able to serve a fixed, limited number of other clients simultaneously. More preferably, the default limit is three other clients, for example, or some other appropriate number which is preferably configured by the user or by the network administrator. If an additional client attempts to download a data package from a client which is already serving the maximum number of other clients, the additional client will receive a "busy" message. This feature limits the load on each individual client.
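The bounded exit degree could be sketched as follows. The class and method names are hypothetical, and the return values "ok" and "busy" merely stand in for whatever messages the data package exchange protocol actually carries.

```python
class PeerServer:
    """Tracks how many peer clients are being served and refuses extras with "busy"."""

    def __init__(self, max_clients=3):  # three is the default limit named in the text
        self.max_clients = max_clients
        self.active = set()

    def accept(self, client_address):
        """Admit a client if a slot is free; otherwise answer "busy"."""
        if client_address in self.active:
            return "ok"
        if len(self.active) >= self.max_clients:
            return "busy"
        self.active.add(client_address)
        return "ok"

    def finish(self, client_address):
        """Release the slot held by a client whose download has ended."""
        self.active.discard(client_address)
```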
[0056] Also preferably, the present invention is able to optimize the selection of the best client from which the data
package should be obtained. For example, if client "A" had already downloaded a larger portion of the required data
package than client "B", transferring the data package from client "A" is more optimal. Such clients are preferentially
selected to serve data packages, since these clients will be able to serve the data package after a shorter time period
has elapsed. Such preferential selection occurs by shortening the time period for waiting before these clients respond,
thereby increasing the likelihood that they will serve the data packages. For this reason, the client preferably calculates
the random delay before responding such that the delay is inversely proportional to the percentage of the data package
which has been already downloaded. In addition, the random delay is preferably proportional to the number of clients
being served at the moment, in order to decrease the likelihood of overloading.
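
One way such a delay could be computed is sketched below in Python. The function name, the `base_seconds` constant and the use of `clients_served + 1` (so that an idle client still waits a non-zero time) are assumptions made for illustration; the text only states the two proportionalities.

```python
import random


def response_delay(fraction_downloaded, clients_served, base_seconds=0.1, rng=random):
    """Pick a random delay before answering a request (illustrative sketch).

    The upper bound grows with the number of clients being served and
    shrinks as more of the data package has been downloaded.
    """
    if not 0.0 < fraction_downloaded <= 1.0:
        raise ValueError("fraction_downloaded must be in (0, 1]")
    # Assumption: +1 keeps the bound positive when no client is being served.
    scale = base_seconds * (clients_served + 1) / fraction_downloaded
    return rng.uniform(0.0, scale)


# A client holding the full package and serving nobody tends to answer sooner
# than a client holding half of the package while serving two other clients.
print(response_delay(1.0, 0))
print(response_delay(0.5, 2))
```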

[0057] In addition, according to other preferred embodiments of the present invention, preferably the entries of the
locations of data packages in the network data packages table are updated according to a probabilistic function. Such
a function is preferred in order to prevent all of the clients from registering a single client as the server for any particular
data package, for example. When different clients respond, usually at different times, indicating they have a specific
data package, the remaining clients listening across the network update the entry for this data package in their network
data packages table, by adding the IP address, or some other type of address according to the addressing system
employed by the network, of the client which can serve the data package to this table. In a simple implementation, the
clients would store only the last advertised location of each data package, and therefore many or all clients might
attempt to obtain the data package from a single client as the internal server, thereby overloading that client.
[0058] To avoid this situation, preferably the following probabilistic algorithm is used to determine the particular
client address which is stored in the network data packages table. Each time a new client transmits a response message,
indicating that this client is able to serve a particular data package, the probability that the new IP address of the new
client is substituted for the old IP address is calculated according to the following equations:

Old IP address: $$P_o(x) = \frac{1}{\text{generation} + 1}$$

New IP address: $$P_n(x) = 1 - \frac{1}{\text{generation} + 1}$$

wherein $P_n(x)$ is the probability that a new IP address is substituted for the old IP address, $P_o(x)$ is the probability
that the old IP address is retained, and "generation" is a number indicating how many times this address had been
previously changed.
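
The two probabilities can be written out as a short Python sketch. The function names and the `(address, generation)` pair are hypothetical. The sketch also assumes that "generation" equals 1 once the first address has been stored, since that is the value at which both probabilities equal 50%.

```python
import random


def retain_probability(generation: int) -> float:
    """Po(x): probability that the old IP address is retained."""
    return 1.0 / (generation + 1)


def replace_probability(generation: int) -> float:
    """Pn(x): probability that the new IP address is substituted."""
    return 1.0 - retain_probability(generation)


def update_entry(entry, new_address, rng=random):
    """Apply one response message to a hypothetical (address, generation) entry."""
    address, generation = entry
    if rng.random() < replace_probability(generation):
        # The stored address changes, so the change counter advances.
        return (new_address, generation + 1)
    return (address, generation)


print(retain_probability(1), replace_probability(1))
```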
[0059] For example, if client "A" responds indicating it has data package "X", then initially all other peer clients store
the IP address of client "A" as the location of data package "X". If client "B" then broadcasts a response also indicating
that client "B" has data package "X", then the probability that any one client now changes the IP address for the location
of data package "X" is 50%. In other words, about half of the clients should now point to client "A" and about half should
point to client "B".
[0060] Such a substantially even distribution of load across multiple clients should produce data flow with a tree-shaped
topology, as shown in Figure 3, rather than a random topology, thus optimizing the average download time and
the load on the serving clients.
[0061] Furthermore, if any client requests a particular data package during the period required by client "A" for
downloading that package, preferably client "A" sends a broadcast or multicast message indicating that the package is
in the process of being downloaded. Therefore, preferably only a single client "B" polls client "A" for each data package,
for example. Other clients preferably automatically receive any responses from that polling action through the broadcast
or multicast transmission, and thus will not be forced to poll for themselves.
[0062] The polling (request/response) traffic is optimized, since there is usually no need to transmit both a request
and a response for each data package. Each peer client receives substantially all of the request/response
communication of all the other clients and remembers the location of the data packages in the network data packages table.

[0063] As previously described, the actual process for receiving a data package from an internal server is
performed [...]. The data package transfer software application of the present invention preferably features
a timer for detection of an aborted transfer or a very slow data package transfer, for example. The timer determines
whether the transfer has timed out. If a time-out occurs, the requesting client preferably retries the transfer.
If the transfer remains unsuccessful after a plurality of attempts, the client preferably no longer attempts to obtain
the data package from the peer client as the internal server, and instead transfers the data package or data
packages from the external server.

[0064] If the requesting client receives a message indicating that the data package is not ready, as well as an indication
of the portion of the data package already downloaded, the requesting client continues polling the serving client until the
data package download is complete. If the download becomes substantially slower or is otherwise interrupted or
terminated for a long period of time, the requesting client behaves as if a time-out occurred.
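
The time-out handling described above can be sketched in Python as a bounded retry loop with a fallback. The function names and the use of the built-in `TimeoutError` to signal a timed-out or stalled transfer are assumptions for illustration, and the two fetch callables stand in for whatever transfer mechanism is actually used.

```python
def fetch_package(package_id, fetch_from_peer, fetch_from_server, max_attempts=3):
    """Try the peer client a limited number of times, then use the external server."""
    for _ in range(max_attempts):
        try:
            return fetch_from_peer(package_id)
        except TimeoutError:
            # An aborted, stalled or very slow transfer is treated as a time-out.
            continue
    return fetch_from_server(package_id)


def always_times_out(package_id):
    # Hypothetical peer that never completes the transfer.
    raise TimeoutError("peer transfer timed out")


def external_server(package_id):
    # Hypothetical stand-in for a download from the external server.
    return b"from-server"


print(fetch_package("X", always_times_out, external_server))
```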
[0065] According to additional preferred features of the present invention, substantially automatic [...]
each peer client to detect the presence of other peer clients on [...] such peer clients [...]
the system as described above would only prolong the time period required to download a data package if no [...]

[0066] Preferably the amount of bandwidth on the local area network which is consumed by each peer client serving
data packages to other clients is limited, to avoid overburdening any host. This limit is preferably configurable
by the user or by the network administrator.
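
The specification does not say how the bandwidth limit is enforced. One common technique is a token bucket, sketched below in Python. The class name, its parameters and the explicit `now` argument (used in place of a real clock so that the example is deterministic) are all assumptions.

```python
class BandwidthLimiter:
    """Token-bucket limiter for bytes served to other clients (illustrative only)."""

    def __init__(self, bytes_per_second: float, burst_bytes: float):
        self.rate = bytes_per_second      # configurable sustained limit
        self.capacity = burst_bytes       # largest burst allowed at once
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, nbytes: int, now: float) -> bool:
        """Return True if nbytes may be sent at time `now` (in seconds)."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


limiter = BandwidthLimiter(bytes_per_second=1000, burst_bytes=1000)
print(limiter.allow(1000, now=0.0))  # the initial burst fits
print(limiter.allow(1, now=0.0))     # the bucket is now empty
print(limiter.allow(500, now=0.5))   # half a second refills 500 bytes
```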

[0067] Furthermore, in order to protect peer clients from unauthorized access of local storage media, according to
preferred features of the present invention, certain security features are preferably included. For example, preferably
only data packages listed in the tables are able to be transferred from the client. Thus, transmitted data packages are
packages which were intended to be served to the peer clients, such that malicious users are preferably
unable to use the system of the present invention to obtain "random" data packages from the storage media of a peer
client. Preferably, data packages are only referenced by their unique identifier, such as their 128-bit MD5 digest.
A data package is only able to be downloaded from a client if the intended recipient knows this digest. Thus,
the name is preferably insufficient information to permit retrieval of the data package from a peer client.
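
Lookup by digest can be sketched in Python with the standard `hashlib` module. The `ServedPackages` class and its methods are hypothetical; the only element taken from the text is that a package is addressed by its 128-bit MD5 digest rather than by name.

```python
import hashlib


def package_id(content: bytes) -> str:
    """Return the 128-bit MD5 digest of a data package as 32 hex characters."""
    return hashlib.md5(content).hexdigest()


class ServedPackages:
    """Packages this client is willing to serve, keyed only by digest."""

    def __init__(self):
        self._by_digest = {}

    def publish(self, content: bytes) -> str:
        """Register a package for serving and return its digest."""
        digest = package_id(content)
        self._by_digest[digest] = content
        return digest

    def get(self, digest: str):
        """Return the package for a known digest, or None for anything else."""
        return self._by_digest.get(digest)


served = ServedPackages()
digest = served.publish(b"quarterly report")
print(served.get(digest) is not None)  # a requester that knows the digest succeeds
print(served.get("report.doc"))        # a file name alone retrieves nothing
```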

[0068] According to another embodiment of the present invention, the system of the present invention is also
applicable to Web browsers, FTP clients, and other software applications involving client-server data transfer. As described
with reference to Figures 4 and 5A-5B, another exemplary embodiment of the present invention [...] content.


[0069] In step 1 of Figure 4, a Web browser being operated by a client computer requests a specific data package.
The Web browser looks at the local cache, as is known to one of ordinary skill in the art. If the data package is found
[...] message requesting this data package, preferably by using broadcast or multicast message transmission. The data
package is identified by a unique identifier. More preferably the unique identifier is the URL of the data
package, or alternatively and preferably a combination of the URL of the data package and timestamp, or by any other
[...]

[0070] If more than one data package is required, the Web browser preferably transmits one
request message containing the list of needed data packages, thereby reducing the total traffic on the network.
Such a situation may arise if, for example, the Web browser had just parsed an HTML (hypertext mark-up language)
document, for example, which contains many links to follow. Preferably and optionally, each request message
contains [...]. As shown in Figure 5A, the request message includes the list of URLs or other unique identifiers to
identify the data package or data packages [...], which is similar in function to the list of MD5 digests described
previously for request messages, and a unique identifier identifying the request message, shown as "REQ [...]".
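
A batched request of this kind might be represented as follows in Python. The field names, the counter used to generate request identifiers and the example URLs are hypothetical; the actual message layout is the one shown in Figure 5A.

```python
import itertools
from dataclasses import dataclass

# Hypothetical source of request identifiers, unique within this process.
_request_ids = itertools.count(1)


@dataclass(frozen=True)
class RequestMessage:
    request_id: int   # identifies the request message itself
    urls: tuple       # unique identifiers of the needed data packages


def build_request(urls):
    """Bundle all needed URLs into one message, dropping duplicates in order."""
    unique = tuple(dict.fromkeys(urls))
    return RequestMessage(next(_request_ids), unique)


links = [
    "http://example.com/logo.png",
    "http://example.com/style.css",
    "http://example.com/logo.png",
]
message = build_request(links)
print(message)
```

Sending one message for all links found on a parsed page replaces what would otherwise be one broadcast per link.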
[0071] In step 2, other Web browsers across the network listen to detect request messages of this type. These Web
browsers, which are peer clients for this embodiment of the present invention, receive this request message and check
their own cache for the requested URL. If the requested URL is found in the local cache of a Web browser, that Web
browser preferably waits a random interval and then preferably transmits a response message indicating that it has the
requested data package (or data packages). Preferably, the message is broadcast or multicast. More preferably, that Web
browser does not reply if another Web browser had replied first. A reply message is preferably sent by a particular Web
browser even if the requested URL is still being downloaded by that Web browser.
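
The reply behaviour in step 2 can be sketched as a small state machine in Python. The class `PeerBrowser`, its method names and the `max_delay` parameter are assumptions for illustration. Network input and timers are modelled as plain method calls, so the sketch is not tied to a particular transport.

```python
import random


class PeerBrowser:
    """Decides whether to answer a request for a cached URL (illustrative only)."""

    def __init__(self, name, cached_urls, rng=None):
        self.name = name
        self.cache = set(cached_urls)
        self.rng = rng or random.Random()
        self.pending = {}  # url -> delay chosen for the scheduled reply

    def on_request(self, url, max_delay=1.0):
        """Schedule a reply if the URL is cached; return the delay, else None."""
        if url not in self.cache:
            return None
        delay = self.rng.uniform(0.0, max_delay)
        self.pending[url] = delay
        return delay

    def on_response_heard(self, url):
        """Another browser replied first, so the scheduled reply is cancelled."""
        self.pending.pop(url, None)

    def on_timer(self, url):
        """Return True if the reply should still be sent when the timer fires."""
        return self.pending.pop(url, None) is not None


a = PeerBrowser("A", {"http://example.com/x"})
b = PeerBrowser("B", {"http://example.com/x"})
delay_a = a.on_request("http://example.com/x")
delay_b = b.on_request("http://example.com/x")
first, second = (a, b) if delay_a <= delay_b else (b, a)
print(first.on_timer("http://example.com/x"))   # the shorter delay replies
second.on_response_heard("http://example.com/x")
print(second.on_timer("http://example.com/x"))  # the other browser stays silent
```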

[0072] In step 3, if no response to an issued request message is received within a certain amount of time, for example
5 seconds, then the process is preferably timed out. In this case, the Web browser preferably no longer attempts to
obtain the URL from another Web browser, and the URL is obtained from the regular Web server using regular HTTP
protocol. Before starting to download the data package from the regular Web server, the Web browser preferably
transmits a response message indicating that this particular Web browser is downloading the data package.
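
The wait in step 3 can be sketched in Python using the standard `queue` module. The function name and the convention that response messages arrive on a queue as peer addresses are assumptions; only the five-second example time-out comes from the text.

```python
import queue


def choose_source(responses, timeout_seconds=5.0):
    """Wait for a peer response; return its address, or None on time-out.

    A return value of None means the URL should be fetched from the
    regular Web server instead of from another Web browser.
    """
    try:
        return responses.get(timeout=timeout_seconds)
    except queue.Empty:
        return None


inbox = queue.Queue()
print(choose_source(inbox, timeout_seconds=0.01))  # nobody answered in time
inbox.put("192.168.0.7")
print(choose_source(inbox, timeout_seconds=0.01))  # a peer answered
```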
[0073] On the other hand, if a response message is received, the Web browser obtains the URL from the other Web
browser, which indicated that it had the URL in the local cache. Preferably, Web browsers across the network record the
URLs and the address from which the response message originated for future use, such that these Web browsers
would be able to download the URL at a future time without transmitting a request message.

[0074] Once the Web browser is able to locate a data package on a neighboring Web browser, the Web browser
attempts to download the data package. The downloading process is performed with a suitable data-transfer protocol,
such as HTTP or FTP. If a time-out or other failure occurs during the processing of data package transfer, the receiving
Web browser preferably performs substantially the entire procedure more than once. More preferably, the number of
permitted attempts to retry the transfer is configurable. If the process fails after these attempts have been performed,
preferably the Web browser transfers the required data package or data packages from the regular Web server.
[0075] According to preferred features of this embodiment of the present invention, data package downloading is
well distributed, such that the Web browsers do not obtain a data package from only a single Web browser, but rather
obtain the data package from a plurality of Web browsers. Such distribution is maintained as follows.

[0076] First, preferably the number of simultaneous data package transfers from a single Web browser is limited. If
this number is exceeded, the Web browser transmits a "busy" message to other Web browsers attempting to transfer
the data package. Next, preferably once a Web browser receives a message giving the location of a particular data
package, the corresponding entry in the hash table for that data package is not altered every time another response
message is received pertaining to this data package. The hash table is preferably altered by subsequent messages in
a probabilistic manner, such that the probability that any particular entry is updated to indicate a new location of a data
package is equal to 1/(generation+1), where "generation" counts the number of times a response message was received
for that data package.

[0077] For example, if Web browser "A" transmits a response message indicating that data package "X" is on the
local cache, then initially all of the neighboring Web browsers have an entry in the hash table indicating that Web
browser "A" is the location of data package "X". If Web browser "B" then transmits a response message for data
package "X", then each Web browser preferably now alters the entry in the hash table to indicate a new location of data
package "X" with a probability of about fifty percent, such that about fifty percent of the Web browsers now have an
entry indicating that the data package is available from Web browser "A" and such that about fifty percent of the Web
browsers now have an entry indicating that the data package is available from Web browser "B". Thus, a good load
distribution can be achieved.
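
The effect of the 1/(generation+1) rule can be checked with a small Python simulation. The function names and the seed are hypothetical. The sketch assumes that "generation" is the number of response messages a listening browser has already received for the package, so that it equals 1 when the second response arrives.

```python
import random
from collections import Counter


def update_location(entry, new_location, rng):
    """Apply one later response to a (location, generation) hash-table entry."""
    location, generation = entry
    if rng.random() < 1.0 / (generation + 1):
        location = new_location
    return (location, generation + 1)


def simulate(num_listeners, responders, seed=0):
    """Count which responder each listening browser ends up pointing to."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_listeners):
        entry = (responders[0], 1)  # the first response is always stored
        for responder in responders[1:]:
            entry = update_location(entry, responder, rng)
        counts[entry[0]] += 1
    return counts


print(simulate(10000, ["A", "B"]))       # roughly half point to each browser
print(simulate(10000, ["A", "B", "C"]))  # roughly a third point to each browser
```

Under this assumption each responder is equally likely to be the final entry, whatever the order of the responses, which is what spreads later downloads across the responding browsers.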

[0078] The random delay (mentioned in step 2 above) chosen by a browser is proportional to the number of
currently served browsers, or the number of browsers currently downloading data packages from that browser, and
inversely proportional to the amount of the data package already downloaded by it. This way the browsers more eligible
to download from are more likely [...]

[0079] While the invention has been described with respect to a limited number of embodiments, it will be
appreciated that many variations, modifications and other applications of the invention may be made.

Claims

1. A method for distributing data packages across a network, the network featuring an external server for serving at
least one data package, the external server being a dedicated server, the steps of the method being performed by
a data processor, the method comprising the steps of:

(a) providing a plurality of peer clients attached to the network and providing a list of data packages, said data
packages being stored by each of said plurality of peer clients, each data package of said data packages having
an entry in said list, said entry indicating a unique identifier for said data package and a location of said data
package in at least one of said plurality of peer clients;
(b) examining said list of data packages by a first peer client to find an entry for a required data package; and
(c) if said entry for said data package is present on said list of data packages of said first peer client, retrieving
said data package from said location at another of said plurality of peer clients according to said entry for said
data package.

2. The method of claim 1, wherein said list of data packages is stored on at least said first peer client.

3. The method of claim 2, wherein alternatively said entry for said data package is absent from said list of data
packages of said first peer client, the method further comprising the steps of:

(d) sending a request message for said data package by said first peer client to at least one other peer client;
(e) if a response message is received by said first peer client from said at least one other peer client, retrieving
said data package from said at least one other peer client by said first peer client.

4. The method of claim 3, the method further comprising the step of:

(f) altering said list of data packages being stored by at least said first peer client for indicating said location of 
said data package according to said response message. 

5. The method of claim 4, wherein if said response message is not received from said at least one other peer client
by said first peer client, the method further comprises the step of: 

(g) obtaining said data package by said first peer client from the external server. 

6. [...] server.

7. The method of claim 6, wherein said list of data packages is stored on each of said plurality of peer clients, the
method further comprising the steps of:

[...] location of said data package according to said response message.
8. The method of claim 5, wherein said list of data packages is stored on each of said plurality of peer clients, the
method further comprising the steps of:

[...] location of said data package according to a probabilistic function.

9. [...] method further comprises the step of:

(d) sending a busy message from said at least one other peer client to said first peer client. 

10. The method of claim 1, wherein the external server is a BackWeb™ server and said plurality of peer clients is a
plurality of BackWeb™ clients.

11. A system for distributing data packages across a network according to a list of the data packages, the system
comprising:

(a) an external server for serving at least one data package, said external server being attached to the network;
(b) a plurality of peer clients attached to the network, the data packages being stored by each of said plurality of
peer clients, such that each peer client retrieves a data package according to [...] plurality of peer clients.

